The Relationship between Graduating Senior Nominations
of Valuable and Non-valuable Courses and End-of-Course
Student Ratings

Gerald M. Gillmore

The College of Arts and Sciences Senior Survey asked students
to nominate their most and least valuable courses during their
undergraduate careers. The end-of-course student ratings were
compared between forty courses rated as valuable and sixteen courses
rated as non-valuable. All differences were statistically significant,
with valuable courses getting more favorable ratings on all items.
Items which most strongly discriminated between the two groups tended
to be those addressing broad educational outcomes, while items showing
least discrimination dealt with the mechanics of good teaching.

The use of student ratings of instructional effectiveness has shown
a marked increase in higher education. However, critics and proponents
alike are (or ought to be) wary of an over-emphasis upon this single
source of evaluative information to the exclusion of others. A statement,
"On the Techniques of Teacher Evaluation," issued by the University of
Washington Faculty Senate Committee on the Evaluation and Improvement of
Teaching, contained the following: "As important as student ratings are,
however, they are simply part of the picture and no single technique can
adequately measure a person's teaching contribution."

The reasons for the seeming over-reliance on student ratings are probably
twofold: they are relatively easy to collect, and they are psychometrically
reliable. Other methods require greater expenditure of valuable
resources, such as faculty time, to obtain systematic and reliable
information.

There are two approaches to this problem, with the approach chosen
having implications for how one views the validity of student ratings
data. If one views student ratings as only one source of data, and other
sources are to be pursued with diligence, then the validity of student
ratings largely comes down to a question of "Is the device collecting
information which is an accurate appraisal of student opinion of the value
of the course at its end?" One could test this validity by coordinating
student rating results with results obtained by concurrently administered
alternative evaluational techniques. Correlations between end-of-course
student ratings and measures from other sources or points in time would
be interesting, especially in respect to learning about the concept of
teaching effectiveness, but would have little to say about the validity
of student ratings per se.

The alternative approach is to view student ratings not only as a
valid measure as defined above but also as a substitute for additional
measures. This approach necessarily broadens the validity question
considerably, because now one is talking about the validity of student
ratings as a measure of teaching effectiveness and not just as a measure
of student opinion about teaching effectiveness. Within this approach,
correlations between student ratings and other measures are direct
indicators of the concurrent validity of the method.

Depending upon the approach one wishes to adopt, the research to be
presented here is a study of the relationship between two measures of
teaching effectiveness or a study of the validity of student ratings.
This particular study concerns the point in time of the student
evaluation. One common criticism of end-of-course student ratings is that
students do not have the necessary perspective to make an accurate assessment
of the value of a course. Anyone attending faculty discussions of student
ratings is familiar with the mythical professor whose courses are (were)
hated and maligned by the students while enrolled, but dearly loved and
respected by these same students upon entry into professional life. The
present study compares the opinion of students at the end of their senior
year with opinions of students at the end of the course. Presumably, at
the end of the senior year, students can look back over their course work
with some greater perspective than they can have at the end of each course
taken. The absence of a relationship between how valuable a course is
viewed by graduating seniors and the student ratings that same course
received would either highlight the importance of systematically
collecting information from graduating seniors for evaluating instructors and
courses or throw the validity of student ratings into question.



Method

The Instruments



The Senior Survey. In June of 1974, questionnaires were mailed to
all baccalaureate degree candidates within the College of Arts and Sciences
at the University of Washington. (For a complete description of the
instrument and results, see de Wolf, Note 1.) Contained in the
questionnaire were the following three requests:

1. Please name three courses and instructors within your major
   which now seem to have been most valuable in your education
   at the UW.

2. Please name three courses and instructors outside your major
   which now seem to have been most valuable in your education
   at the UW.

3. Please name three courses and instructors which now seem to
   have been least valuable to your education at the UW.

Responses to these three requests were the only data used from the Senior
Survey in this study.

The University of Washington Survey of Student Opinion of Teaching.
From 1968 until 1974, the standard form for collection of student ratings
data at the University of Washington contained 24 items, only the first 15
of which were used for this study. These items are found in Table 1.
Each item employs a five-position response scale, with 1 being assigned to
the most favorable position and 5 to the least.

Subjects

The Senior Survey was sent to 1,845 students, and returned by 898 or
48 percent of the population. Students completing end-of-class student
ratings were those enrolled in the specific class rated. Some subjects
completing the Senior Survey may also have been among the subjects who
completed student ratings in some cases. The anonymity of student
ratings precludes any determination of this overlap, but it is probably
negligible.

Selection of Classes

The unit of analysis for this study was classes, not students. In
the Senior Survey a total of 2,641 course-instructor combinations were
mentioned one or more times (about half of these were mentioned only
once). Courses mentioned without a specific instructor, and instructors
mentioned without a specific course, were eliminated from consideration. A
course-instructor combination, which will be henceforth referred to as
simply course, could be nominated by a graduating senior under any one of
the three requests: most valuable within major, most valuable outside
major, and least valuable. Thus, for each course, a resultant was
calculated which summed the number of nominations in the first two categories
and subtracted the number in the third. A positive resultant is indicative of
a "valuable" class, a negative resultant is indicative of a "non-valuable"
class. The range of the resultants was +51 to -28.

From these data, two groups of courses were formed. Valuable courses
were defined as those having a resultant of +6 or greater. Non-valuable
courses were defined as those having a resultant of -3 or less. The former
minimum was chosen to be twice as great in absolute value as the latter
because of the ratio of two positive questions to one negative question.
This procedure yielded 64 valuable courses and 30 non-valuable courses.
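
The resultant and the two cutoffs described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the procedure actually used to tabulate the survey: the request labels, the function names, and the three example courses are hypothetical, and only the cutoffs (+6 and -3) come from the text.

```python
from collections import Counter

# Cutoffs stated in the text.
VALUABLE_MIN = 6        # resultant of +6 or greater
NON_VALUABLE_MAX = -3   # resultant of -3 or less

# Hypothetical labels for the three survey requests.
POSITIVE = {"most_within_major", "most_outside_major"}
NEGATIVE = {"least_valuable"}

def resultants(nominations):
    """Return {course: resultant}, adding 1 per 'most valuable' nomination
    and subtracting 1 per 'least valuable' nomination."""
    totals = Counter()
    for course, request in nominations:
        if request in POSITIVE:
            totals[course] += 1
        elif request in NEGATIVE:
            totals[course] -= 1
        else:
            raise ValueError(f"unknown request label: {request!r}")
    return dict(totals)

def classify(resultant):
    """Assign a course to a group, or None if it falls between the cutoffs."""
    if resultant >= VALUABLE_MIN:
        return "valuable"
    if resultant <= NON_VALUABLE_MAX:
        return "non-valuable"
    return None

# Hypothetical nominations for three course-instructor combinations.
nominations = (
    [("Course A", "most_within_major")] * 5
    + [("Course A", "most_outside_major")] * 2
    + [("Course A", "least_valuable")]
    + [("Course B", "most_outside_major")]
    + [("Course B", "least_valuable")] * 4
    + [("Course C", "most_within_major")] * 2
)

totals = resultants(nominations)
groups = {course: classify(r) for course, r in totals.items()}
print(totals)
print(groups)
```

Courses whose resultant falls strictly between the two cutoffs are returned as None here, reflecting that they belong to neither group.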

Next, files were checked to see which courses had been rated by
students using the standard University of Washington Student Rating form
any time during the years 1968 to 1974. Of the 64 valuable courses, 40
had been rated. Of the 30 non-valuable courses, 16 had been rated. The
difference in the proportion of courses found was not significant (χ² = .74,
df = 1). In those cases where the same course had been rated for more than
one offering, one particular section was chosen randomly. Thus, the final
sample consisted of 40 valuable courses and 16 non-valuable courses.
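
A comparison of the two proportions of rated courses can be sketched as a 2 x 2 chi-square test on the counts given above. The sketch assumes the uncorrected Pearson statistic; the text does not say which variant (for example, with or without a continuity correction) was computed, so the value obtained this way need not reproduce the reported figure exactly. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square, without continuity correction, for the table
    [[a, b], [c, d]] (df = 1)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Counts from the text: 40 of 64 valuable and 16 of 30 non-valuable
# courses had been rated on the standard form.
rated_valuable, total_valuable = 40, 64
rated_non_valuable, total_non_valuable = 16, 30

chi2 = chi_square_2x2(
    rated_valuable, total_valuable - rated_valuable,
    rated_non_valuable, total_non_valuable - rated_non_valuable,
)

CRITICAL_05_DF1 = 3.841  # chi-square critical value for alpha = .05, df = 1
print(f"chi-square = {chi2:.2f}, significant at .05: {chi2 > CRITICAL_05_DF1}")
```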

Method of Comparison

The two types of classes were compared on each of the 15 student
rating items by use of t tests. Also computed for each item was ω² (Hays,
1963, p. 327), which is an index of the strength of relationship. It is
indicative of the proportion of the variance of the dependent variable which
is attributable to the independent variable.
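
For readers who wish to see the two statistics side by side, the following Python sketch computes a pooled-variance t for two independent groups and the commonly cited estimate of ω² derived from it, (t² - 1) / (t² + N1 + N2 - 1). It is assumed, not confirmed by the text, that this is the form of the index taken from Hays; the function names and the course means in the example are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Independent-groups t statistic using the pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

def omega_squared(t, n_a, n_b):
    """Estimated proportion of variance accounted for by group membership."""
    return (t ** 2 - 1) / (t ** 2 + n_a + n_b - 1)

# Hypothetical course means on one item (1 = most favorable, 5 = least).
valuable = [1.5, 1.8, 2.0, 1.7]
non_valuable = [2.6, 2.9, 2.4, 3.1]

t = pooled_t(valuable, non_valuable)
w2 = omega_squared(t, len(valuable), len(non_valuable))
print(f"t = {t:.2f}, omega squared = {w2:.2f}")
```

Because 1 is the most favorable scale position, a more favorably rated valuable group yields a negative t when it is entered first; ω² depends only on t² and so is unaffected by the sign.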

The reader should be cautioned that even though each of the 15 t tests
is independently computed, the 15 items of the student ratings form are
positively intercorrelated. Thus, results should not be interpreted as if
there are 15 statistically independent dependent variables.

Results and Discussion

Class Size

The average class size was 82.9 students for the valuable courses
and 66.4 students for the non-valuable courses. This difference was
non-significant (t = .389), probably for the most part due to the relatively
large standard deviations (67.3 and 55.3 respectively). These large
average class sizes are no doubt an artifact of the selection method; larger
classes are apt to get more nominations by sheer force of numbers. However,
this result does show that a course does not require a small enrollment to
be considered valuable. It also suggests that valuable and non-valuable
courses are not differentiated by class size; that is, the data do not
support the view that large classes are not of value while small classes are.

Student Ratings Items

The results of the student rating item comparisons are found in
Table 1. The items have been arranged in order of the magnitude of t
value and ω².

As can be seen in Table 1, the t values for all items were highly
significant. The means show that the valuable group of classes was given
a more favorable average rating in every case.

The ω²'s ranged from .46 to .17, illustrating reasonably strong
relationships. To give an alternative indication of the magnitude of the
relationships, the frequency distribution of course means within the two
groups for item 9, the item which exhibited the strongest relationship,
is found in Table 2. The relatively small amount of overlap is readily
apparent.

It might be well to remind the reader at this point that these
comparisons are not between the graduating seniors' nominations of
valuable and non-valuable courses and their own ratings of the course at the
time in which they were enrolled. The latter data are based on the ratings of
a specific course offering and may contain a few of the sample of
graduating seniors, but would almost have to contain mostly students not within
the sample. In fact, no attempt was made to find the particular course
offering in which the seniors were enrolled, if indeed all who nominated
a particular course were enrolled in the same offering of it, e.g., Fall
Quarter, 1972, as opposed to Fall Quarter, 1973. This is not considered
a weakness of this study, however. The uncontrolled variance resulting
from choosing a particular course offering would necessarily add to the
error variance of the t tests and ω²'s and reduce the magnitude of the
t values and ω²'s by an unknown amount. For example, suppose a teacher
offered the same course twice, and one offering was superior to the second.
The superior offering would be more apt to be mentioned as a valuable
course but no more apt to be chosen as the course within the sample
analyzed. Therefore, the highly significant and strong relationships found
might be considered all the more impressive. Certainly a strong
relationship between the alternative methods is in evidence.

Table 2

Frequency Distribution for Item 9 within Groups

                         Frequency
Mean      Valuable courses (N=40)    Non-valuable courses (N=16)

The items in Table 1 were ordered by the size of the t values and,
equivalently, ω²'s. This was done in order to speculate from
the data about what is important in producing a class which will be
considered valuable by graduating seniors. This post-hoc analysis of a
non-experimental study is fraught with danger, however, and should be
approached with caution. While the t values and ω²'s can legitimately be
considered random variables, there is random fluctuation which is hard to
take into account.

Be that as it may, those items at the top of the list and therefore
most discriminating between the two groups appear to be those relating to
broad, abstract educational outcomes, e.g., "Gave me new viewpoints or
appreciations," "Helped broaden my interests," and "Motivated me to do my
best." The least discriminating items seem to be those relating to
specific teaching behaviors, e.g., "Clear and understandable in explanations,"
"Made good use of examples and illustrations," and "Material presented in a
well-organized fashion."

This is a curious result in that it is items of the latter type that
consistently show up as most important for good teaching. But having the
mechanics of good teaching per se is apparently not sufficient to have a
course chosen as one of the three most valuable during an entire
undergraduate career, nor is poor technique sufficient to have a course
chosen as one of the three least valuable. These data suggest that the
most impressive courses are those providing new and fresh perspectives,
which broaden and motivate the students. One could argue that the
enthusiastic presentation of material (the second most discriminating
item) requires something beyond just clear explanation of the basic
concepts of a field. Furthermore, the item "Abstract ideas and theories
were clearly interpreted" is higher on the list than "Clear and
understandable in explanations," which also is possibly indicative of the
instructor's success at going beyond the basic data.

One final note on this question. The least discriminating item,
"Helpful to Individual Students," gives indirect evidence that the path
to being considered a valuable course by some number of students is not
necessarily through spending a lot of time with individual students.